-
Notifications
You must be signed in to change notification settings - Fork 684
Feat S3 Transfer Manager v2 GetObject/DownloadObject #2996
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
4021ed1
to
dd3bcc7
Compare
} | ||
|
||
// Queue the next range of bytes to read. | ||
ch <- getChunk{w: g.w, start: g.pos - g.offset, withRange: g.byteRange()} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So as far as I can tell, you're now doing concurrent downloads but you're buffering the entire object in memory before returning it to the caller - which is all well and good until someone goes to try and pull a multi-gigabyte object.
We're basically going to have to return a reader that "composes" the response bodies from inner getPart requests in order. That will allow you to stream the response back to the caller without that extra memory pressure. You might have a "scratch" buffer for that internal composition but naturally it won't be the full object size.
f404d3f
to
0ce3dca
Compare
// Deprecated: This field is handled inconsistently across AWS SDKs. Prefer using | ||
// the ExpiresString field which contains the unparsed value from the service | ||
// response. | ||
Expires time.Time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to address this now but I question whether we should have the time.Time
version of this in the transfermanager. We should be using these APIs as an opportunity to get rid of old baggage like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've discussed the nuances of exposing the GetObject reader offline, you're free to address that now or ship and address in a followup.
@@ -1,193 +0,0 @@ | |||
package testing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
General question, but what's the expected status of the old transfer manager? if we're not completely getting rid of it, do we still keep testing for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so, we just leave it here and make any change if there's any checksum related update
} | ||
}, | ||
}, | ||
"parts download with part number input": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this actually check that we're doing part object? It seems like if the data is small enough, this is the equivalent of just getting everything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, this test case verifies that we single download the whole object anytime user passes a part number
g.r.setCapacity(capacity) | ||
} | ||
|
||
if i == capacity { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we hit this case, wouldn't we deadlock if g.r.getRead() != capacity
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
g.r.read
would be increased by reader in another goroutine, that's why we continue the loop to re-check g.r.getRead()
at line 668
3706344
to
a3113eb
Compare
* recommit transfer manager v2 files * change pool to store slice pointer * add integ test for putobject * update go mod * minor changes for v0.1.0 * update tags * update tags * update integ test dependency version * change err var name * update go mod * change input/output type comment * minor change --------- Co-authored-by: Tianyi Wang <[email protected]> rebase from main rebase branch from main
move transfer manager v2 integration tests to within module move putobject integ test to transfer manager module
add getobject integ test
a3113eb
to
684ca8e
Compare
Implement v2 s3 transfer manager's GetObject/DownloadObject api bound to single union client which mimics normal service client's initialization and api call. User could now download object sequentially with GetObject or asynchronously with DownloadObject.
Test: passed unit test for GetObject and DownloadObject, passed integration test of uploading object via s3 client and validating its content after downloading through v2 transfer manager.